4 research outputs found

    Efficient Analysis of Complex Diagrams using Constraint-Based Parsing

    This paper describes substantial advances in the analysis (parsing) of diagrams using constraint grammars. The addition of set types to the grammar and spatial indexing of the data make it possible to efficiently parse real diagrams of substantial complexity. The system is probably the first to demonstrate efficient diagram parsing using grammars that can easily be retargeted to other domains. The work assumes that the diagrams are available as a flat collection of graphics primitives: lines, polygons, circles, Bezier curves and text. This is appropriate for future electronic documents or for vectorized diagrams converted from scanned images. The classes of diagrams that we have analyzed include x,y data graphs and genetic diagrams drawn from the biological literature, as well as finite state automata diagrams (states and arcs). As an example, parsing a four-part data graph composed of 133 primitives required 35 sec using Macintosh Common Lisp on a Macintosh Quadra 700. Comment: 9 pages, Postscript, no fonts, compressed, uuencoded. Composed in MSWord 5.1a for the Mac. To appear in ICDAR '95. Other versions at ftp://ftp.ccs.neu.edu/pub/people/futrell
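
The abstract above credits spatial indexing of a flat collection of graphics primitives for much of the parsing speed-up. As a rough illustration only, the sketch below shows one common way such an index can work: a uniform grid over bounding boxes. It is not the paper's implementation (which was written in Macintosh Common Lisp and is not described in the abstract); the names `Primitive`, `GridIndex`, `cell_size` and the sample coordinates are all assumptions made for this example.

```python
from collections import defaultdict
from typing import NamedTuple


class Primitive(NamedTuple):
    """A graphics primitive reduced to its bounding box (hypothetical type)."""
    name: str
    xmin: float
    ymin: float
    xmax: float
    ymax: float


class GridIndex:
    """Uniform-grid spatial index: each cell lists the primitives overlapping it."""

    def __init__(self, cell_size: float = 10.0):
        self.cell_size = cell_size
        self.cells = defaultdict(set)

    def _cells_for(self, xmin, ymin, xmax, ymax):
        # Yield the (column, row) key of every grid cell the box touches.
        c = self.cell_size
        for ix in range(int(xmin // c), int(xmax // c) + 1):
            for iy in range(int(ymin // c), int(ymax // c) + 1):
                yield ix, iy

    def insert(self, p: Primitive) -> None:
        for cell in self._cells_for(p.xmin, p.ymin, p.xmax, p.ymax):
            self.cells[cell].add(p)

    def query(self, xmin, ymin, xmax, ymax):
        """Return primitives whose bounding box intersects the query box."""
        found = set()
        for cell in self._cells_for(xmin, ymin, xmax, ymax):
            for p in self.cells.get(cell, ()):
                # Sharing a cell is not enough; confirm the boxes overlap.
                if (p.xmin <= xmax and p.xmax >= xmin
                        and p.ymin <= ymax and p.ymax >= ymin):
                    found.add(p)
        return found


# Made-up primitives loosely resembling part of an x,y data graph.
index = GridIndex(cell_size=10.0)
for prim in [
    Primitive("x_axis", 0, 0, 100, 0),
    Primitive("y_axis", 0, 0, 0, 100),
    Primitive("label", 40, -12, 60, -6),
    Primitive("far_point", 90, 90, 91, 91),
]:
    index.insert(prim)

near_origin = sorted(p.name for p in index.query(-1, -1, 1, 1))
print(near_origin)
```

A constraint such as "these two lines touch" then only needs to test candidates returned by `query` rather than every pair of primitives, which is the kind of saving a spatial index offers on diagrams with many primitives.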

    Understanding Diagrams in Technical Documents

    The ultimate goal of document analysis is to go from hardcopies to a computer-based knowledge bank representing the documents' contents. In technical documents, diagrams often play a critical role. An approach is described which can analyze diagrams to yield structural descriptions for them. The approach combines grammatical and constraint-based techniques in a single scheme called Graphics Constraint Grammars. The most important constraints are the Generalized Equivalence Relations which efficiently recognize conceptual groupings of objects. Keywords: computer vision, constraint satisfaction, diagrams, document analysis, knowledge representation, parsing. Converting documents to knowledge bases requires that a computer function as an intelligent "reader" or "viewer" of the documents. This is an artificial intelligence task involving computer vision and natural language understanding. The Biological Knowledge Laboratory at Northeastern University is developing such a system. Its goal ..

    Understanding Technical Articles and Their Diagrams

    A comprehensive system is being developed to transform scientific research papers into a knowledge base. An intelligent Scientist's Assistant is also being developed to allow a scientist to navigate through the knowledge base. Starting from paper documents, both text and graphics are captured and analyzed. We describe the entire system and then focus on the analysis of diagrams such as data graphs. Model-based image processing discovers graphical objects such as lines, polygons and text. The model-based methods are necessary to deal with occlusion such as data points lying on data lines, or data lines which collide with graph axes. The analysis is assisted by the use of a spatial index. Then a syntactic Graphics Constraint Grammar (GCG) is used to parse the collection of objects to identify higher-level structures. Each grammar rule has three components: (1) a production specifying a parent and its constituents, (2) a set of constraints which are typically geometrical relations that th..